Figure 3.9). Use the mouse to display each of these three plots, and move the pointer over a
plot to obtain immediate information about the scaffold.

The assembly metrics shown in Figure 3.6 are the ones used for assembly assessment.
The most important are the total length of the assembly, N50, N75, L50, and L75. Figure
3.8 shows the metrics for the three assemblies in three columns. The two assemblies generated
by SPAdes (non-hybrid and hybrid) are better than the assembly generated by ABySS. In
general, a larger N50, a lower L50, and a longer total assembly length indicate a better
assembly, provided that the total length stays close to the expected genome size. The total
lengths of the three assemblies are 4,609,549 bp (4.609549 Mb), 4,686,651 bp (4.686651 Mb),
and 4,623,212 bp (4.623212 Mb), respectively, all close to the 4.641650 Mb length of the
reference genome of E. coli str. K-12. Notice also that the SPAdes assemblies have the
largest N50 and N75 and the lowest L50 and L75.
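To make these metrics concrete, the following Python sketch computes N50, N75, L50, and L75 from a list of contig lengths. It is a minimal illustration rather than QUAST's own code: the function name `nx_lx` and the toy contig lengths are assumptions, not data from the figures. It uses the common definition in which Nx is the length of the shortest contig in the smallest set of longest contigs that covers at least x% of the assembly, and Lx is the number of contigs in that set.

```python
def nx_lx(contig_lengths, percent):
    """Return (Nx, Lx) for a list of contig lengths in bp.

    percent is 50 for N50/L50 and 75 for N75/L75.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")

    # Walk through the contigs from the longest to the shortest.
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        # Stop once the running sum covers percent% of the assembly.
        if running * 100 >= total * percent:
            return length, count


# Hypothetical contig lengths (bp), chosen only to keep the arithmetic small.
toy_contigs = [100, 80, 60, 40, 20]

n50, l50 = nx_lx(toy_contigs, 50)
n75, l75 = nx_lx(toy_contigs, 75)
print(f"total={sum(toy_contigs)} N50={n50} L50={l50} N75={n75} L75={l75}")
```

With these toy lengths the assembly totals 300 bp, so the two longest contigs (100 + 80) already cover half of it. A more contiguous assembly of the same total length would reach that halfway point with fewer, longer contigs, which is why a larger N50 goes together with a lower L50.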

The QUAST report also includes the Icarus contig viewer, a QUAST-based genome viewer
for the assessment and analysis of draft genome assemblies. Click “View in
Icarus contig browser” at the top right to display the Icarus QUAST contig browser.
In Figure 3.10, Icarus shows the contig size viewer, in which contigs are ordered
from the longest to the shortest and which highlights N50, N75 (NG50, NG75), and contigs
longer than a user-specified threshold. To learn more about the Icarus viewer and the QUAST
output reports, refer to the QUAST manual at “http://cab.cc.spbu.ru/quast/manual.html#sec3.4”.
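The ordering and highlighting described above can be approximated outside the viewer. The sketch below is not Icarus code. It sorts hypothetical contigs from the longest to the shortest, computes NG50 and NG75 (defined like N50 and N75, but measured against a reference or estimated genome size instead of the assembly length), and marks contigs above a user-chosen threshold. The names `ng_x`, `genome_size`, and `min_len`, as well as all the values, are assumptions made for illustration.

```python
def ng_x(contig_lengths, genome_size, percent):
    """Return NGx in bp, or None if the contigs cover less than percent% of the genome."""
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # The target is a share of the genome size, not of the assembly length.
        if running * 100 >= genome_size * percent:
            return length
    return None


# Hypothetical contig names and lengths (bp).
contigs = {"c1": 40, "c2": 100, "c3": 20, "c4": 80, "c5": 60}
genome_size = 400  # assumed reference or estimated genome size (bp)
min_len = 50       # assumed user-specified threshold for "long" contigs (bp)

# Order contigs from the longest to the shortest, as in the contig size viewer.
ordered = sorted(contigs.items(), key=lambda item: item[1], reverse=True)

ng50 = ng_x(contigs.values(), genome_size, 50)
ng75 = ng_x(contigs.values(), genome_size, 75)
long_contigs = [name for name, length in ordered if length > min_len]

for name, length in ordered:
    marks = []
    if length == ng50:
        marks.append("NG50")
    if length == ng75:
        marks.append("NG75")
    if length > min_len:
        marks.append("long")
    print(f"{name}\t{length}\t{', '.join(marks)}")
```

Because the toy assembly (300 bp) is shorter than the assumed genome (400 bp), NG50 and NG75 fall on shorter contigs than N50 and N75 would, and a target above 75% cannot be reached at all, in which case `ng_x` returns `None`.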

If you need to assess the assembly using a reference as a guide, you can download the

FASTA file of a reference genome (Genome) with its annotation file (GFF) from a database

FIGURE 3.9  QUAST assembly assessment plots.

FIGURE 3.10  Icarus contig browser.